COMP124 - Week 6 Linux

History

Linux is an open-source OS
It's usually packaged into distributions that all use the same Kernel but add other open-source system software and libraries

Kernel was developed by Linus Torvalds

Inspired by Minix (itself based on Unix)
Packaged with GNU versions of Unix System Software

Linux System Architecture

User Mode
- User Software
- System Components
  - Daemon Processes
  - Window Manager
  - Graphics
  - API Libraries
- C Standard Library (~2000 subroutines)
Kernel Mode
- Kernel System Call Interface (SCI) (~380 system calls)
- Kernel (Processes, Memory, Files, Devices, Networking) & Kernel Modules (Device Drivers, etc.)
Hardware

System Calls

User programs need to access I/O devices to interact with the keyboard, mouse, disk, or network, but the protection rings prevent direct access

OSs provide a collection of system calls

Often implemented as an interrupt
Along with other useful functions, these form the application programmer interface (API) of the OS

Library code is included or imported at the top of source code

Provides a wrapper that makes system calls look like ordinary subroutine calls
Libraries (user mode) use system calls (kernel mode) to carry out privileged tasks

Kernel Modules

The kernel is privileged (in every OS); it has no restrictions for any action
Device drivers also need privileged access to hardware
Linux has trusted device drivers built into the kernel itself

Monolithic Kernel
- Advantage: All drivers included when kernel is compiled
- Disadvantage: Kernel image is very big on disk and in memory
- Disadvantage: Need to recompile kernel to add new drivers or functionality
Modular Kernel
- Advantage: Specific drivers loaded when the system boots up
- Disadvantage: Fragmentation of kernel memory as file systems and modules are loaded
- Disadvantage: Security and stability risk from loading bad modules

Graphical and Terminal Shells

The original Unix/Linux shell was purely text based

Type a command and press enter
See result as text in a single scrolling terminal

Newer Linux distributions include various graphical shells

WIMP (Windows, Icons, Menus, Pointer)
GUI (Graphical User Interface)
User can install and boot into their preferred desktop shell (KDE, Gnome, etc.)
Shell and Window Manager run as user level processes

The Uni's Linux Farm

Sixteen servers called lxfarm01, lxfarm02, ... , lxfarm16
Connect via lxfarmXX.csc.liv.ac.uk
Each server has 8 physical CPUs (each with 4 cores)
Use MWS login details, may need to be on campus or using the VPN service

Directories

There are four special directory names

~ - Home directory
. - Current directory
.. - Parent of current directory
/ - Root directory

Paths can be absolute (from root) or relative (to where you are)

File Permissions

Permissions are shown as a 10 character string split into four parts:

eg. drwxr-x--- or -rw-r--r--
First character indicates directories and other special files
Next three characters for user permissions (read, write, execute)
Then three for group permissions
Then three for other permissions (ie. everyone on the system)
Every user belongs to a group (can be in multiple groups)
Every process is owned by a user and a group (even system processes)

Root User

The user root has full access to everything
Many system files and background processes are owned by the root user
Root user permissions can be requested by users using the sudo command

Setting Octal and Mnemonic Permissions

Use the chmod command to change the permissions of a file
To add write permission for the group: chmod g+w filename
To remove read permission for other: chmod o-r filename
You can also use octal numbers to change permissions quickly

(add up)	User	Group	Other
Read	4	4	4
Write	2	2	2
Execute	1	1	1
Most files are set to 640 and most directories to 750

640 is -rw-r----- (user read/write, group read)
750 is drwxr-x--- (user read/write/exec, group read/exec)

Assembly in Linux

Assembly code can be written and assembled in Linux with NASM

Put pure assembly code into a text file
Use interrupts to trigger system calls to read, write, exit, etc.
Assemble with command line options
Link into an executable with further options
This is much more complicated than using Visual Studio
To assemble, link, and run assembly saved in hello.asm

nasm -f elf32 hello.asm
ld -m elf_i386 hello.o -o hello
./hello

Code Example

Some code to display some text would look like:

global _start                     ; Tell linker where to start
section .data                     ; Initialised data goes in .data section
	msg db 'Hello World!', 10, 0  ; db = define bytes (10 = \n)
	len dd 13                     ; dd = define dword (12 chars + \n)

section .text                     ; Code goes in .text section
	_start:
		mov eax, 4                ; 4 = sys_write
		mov ebx, 1                ; 1 = file descriptor for STDOUT
		mov ecx, msg              ; Address of string
		mov edx, [len]            ; Length of string
		int 0x80                  ; Trigger interrupt (0x80 is 128 decimal)
		mov eax, 1                ; 1 = sys_exit
		mov ebx, 0                ; Exit status 0 (success)
		int 0x80                  ; Trigger interrupt

Process Creation

Processes are created (spawned) by other processes
The original process is the parent, the new process is the child
All running processes form a tree structure (which can be shown in Linux using the pstree command)
Everything has systemd as the top-level ancestor

There are several system calls in Linux for creating and managing child processes:

exec()
- Replaces the program running in the calling process with a new one
- The new program overwrites the old one in memory; the process keeps the same PID
fork()
- Spawns a new clone of the process
- Both parent and child continue to run
wait()
- Called by parent process
- Blocks until child process terminates

Fork

The fork() system call returns one of three possible values

<0 (negative) - If the child could not be created (failure)
=0 (zero) - In the child process
>0 (positive) - In the parent process (child process ID)

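#include <stdio.h>
#include <unistd.h>
int main(void) {
	pid_t pid = fork();
	// Call returns in each of the two processes
	if (pid < 0) {
		perror("fork"); // Child could not be created
		return 1;
	} else if (pid == 0) {
		printf("I'm the child process\n");
		// Will usually call exec() to load its own code
	} else {
		printf("I'm the parent and my child's ID is %d\n", (int)pid);
	}
	return 0;
}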

The First Process

ROM stores a small program that runs a bootloader when a system is first turned on

Most Linux systems use GRUB (GNU GRand Unified Bootloader)
Loads kernel image from disk and starts fetch-execute cycle from first instruction
The first process to run is called systemd
Its process ID (PID) is 1
Spawns all other processes required by the kernel
Can be configured for various targets (eg. server or desktop)

Continues to run as a background process (daemon)

Offers on-demand spawning of other services
Maintains logfiles to record system activity
Keeps track of other processes and kernel settings

Shell Login

The sshd daemon (spawned by systemd) runs in the background, waiting for incoming connections

An ssh client is used to connect to a Linux server
The sshd daemon uses fork() to spawn a child process
The child uses exec() to run a login process
The login process checks credentials
Then it uses exec() to run preferred shell process

Processes are being created all the time

Everything is done via fork() and exec() system calls
Windows and other major OSs have the same concept of processes spawning processes, although the calls differ (e.g. CreateProcess() on Windows)

Running a Shell Command

When a command is typed into the shell:

The shell uses fork() to create a child process
That child uses exec() to run the command we typed in
(This can be shown with the ps command)

A desktop shell (GUI) does the same thing but with clicks to spawn processes

Zombies and Orphans

Parent processes usually wait for their children to die

Processor manager will tell the parent that the child terminated
Sends a SIGCHLD signal to the parent
Clean-up is not done until the parent acknowledges it no longer needs its child
If the termination of a child is not acknowledged by the parent:
- The child becomes a zombie
- It has finished but is still present in the process table
If a parent terminates before its children (e.g. crashes):
- The children become orphans
- They are adopted by the systemd process

The systemd process calls wait() on the children it adopts, so orphaned zombies are removed from the process table

Daemon Processes

Daemon processes such as sshd and systemd

Usually the process name ends with 'd' to signify daemon
Not attached to a terminal or an interactive login session
Run permanently in the background
Perform the background operations of the OS
Subsystem managers are daemon processes
Need their own time on the CPU, so must be scheduled
Usually run with a higher priority
Perform tasks requested by other processes

Processes in the Linux File System

Kernel stores housekeeping information in the /proc directory

Dynamic details about the current state of the kernel
Virtual file system called procfs (ie. not all real files on disk)
Subdirectory for each running process

For example:

Subdirectory	Purpose
/proc/PID/	Stores all the details and status of process PID
/proc/PID/cmdline	The text that was typed to start the process
/proc/PID/fdinfo	The status of any open files used by the process
/proc/PID/status	The overall status of the process (lots of detail)
/proc/cpuinfo	Stores details about the physical CPU(s)
/proc/modules	Stores info about currently loaded kernel modules

There is a top command to see a dynamic (real-time) display of process details

Linux Signals

A process running from the terminal can usually be terminated by typing ^C (CTRL+C)

This sends an interrupt signal to the process
Process intercepts signal and responds by terminating
Signals can be sent between two processes with the kill() system call (signal() is used to install a handler, not to send)
We can send signals with the kill command at the shell prompt
There are various signals denoted by numbers and codes
For example, to terminate process 438
- kill -s SIGKILL 438

Responding to Signals

The process that receives a signal can respond in three ways

Perform the action requested
Ignore the signal completely
Catch the signal and run some other arbitrary code
The only signals that cannot be ignored or caught are SIGKILL (9) and SIGSTOP (19)

Code	Number	Meaning
SIGINT	2	Interrupted from keyboard (via CTRL+C)
SIGKILL	9	Forces process to terminate (cannot be caught or ignored)
SIGTERM	15	Request to terminate process (might be ignored)
SIGCHLD	17	Indicates that a child process has terminated
SIGIO	29	Indicates that input or output is ready

Terminating Zombie Processes

Imagine a parent process is badly coded:

Spawns a child via fork()
But does not acknowledge when that child terminates (via wait() or SIGCHLD)

The child will become a zombie

Hangs around in process table doing nothing
Will take up some (minimal) system resources

You can try to resend SIGCHLD to the parent but it will probably ignore it
So you'll have to send SIGKILL to the parent to kill it

The parent will be terminated and systemd will adopt the zombie
systemd then calls wait() on the zombie, which removes it from the process table

Inter-Process Communication (IPC)

Processes need to communicate with each other

To share, send, and receive data
To provide services (servers) to other processes (clients)
Types of IPC
Shared memory
Shared files
Pipes
Sockets

Shared memory and shared files allow two processes to access the same memory location or file at the same time

Introduces synchronisation issues
Needs to be coordinated by semaphores and locks

Pipes

A pipe is a form of IPC between related processes, typically two children of the same parent
Usually triggered by typing a command at the prompt

Join two processes with the | (pipe) symbol
Output from the first becomes input to the second
For example, you can list all kernel modules with cat /proc/modules and you can count the lines of a file with wc --lines filename
So you can find out how many kernel modules are running with
cat /proc/modules | wc --lines

Sockets

A socket is a form of IPC that can span multiple systems

One process is the server (daemon) listening for clients
Other process is the client that connects to the server
Communication is bidirectional (both can send/receive)
Accessed through file descriptors like files (Unix domain sockets also appear as special files in the file system)
The processes don't need to be on the same machine
Don't need the same parent process (unlike pipes)
Typically provide internet services
Server (eg. httpd, maild, sshd) is running as a daemon
Clients connect when they need to get/send data

Client-Server Socket Handling

Server process uses the listen() system call to start listening, then accept() to wait for clients

When a client connects, usually spawns child via fork()
Child process handles the communication then terminates
Parent process puts accept() in a loop to handle multiple incoming clients
The socket looks like a local file to the read() and write() system calls

There is a diagram for this at the end of lecture 12